PolyUCOMP: Combining Semantic Vectors with Skip bigrams for Semantic Textual Similarity
Abstract
This paper presents the work of the Hong Kong Polytechnic University (PolyUCOMP) team, which participated in the Semantic Textual Similarity task of SemEval-2012. The PolyUCOMP system combines semantic vectors with skip bigrams to determine sentence similarity. The semantic vectors are used to compute similarities between sentence pairs, drawing on the lexical database WordNet and the Wikipedia corpus. The skip bigrams bring word order into the sentence similarity measure.
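As an illustration of how skip bigrams make a similarity score sensitive to word order, the following is a minimal sketch and not the PolyUCOMP implementation. The function names, the whitespace tokenization, the optional `max_skip` gap limit, and the Dice-style overlap score are all assumptions made for this example; the abstract does not specify how the system scores skip-bigram overlap or how it combines that score with the semantic vectors.

```python
from collections import Counter
from itertools import combinations


def skip_bigrams(tokens, max_skip=None):
    """Count ordered word pairs (w_i, w_j) with i < j.

    max_skip limits how many words may lie between the two members of a
    pair; None allows any gap. (Hypothetical parameter for illustration.)
    """
    pairs = Counter()
    for i, j in combinations(range(len(tokens)), 2):
        if max_skip is None or j - i - 1 <= max_skip:
            pairs[(tokens[i], tokens[j])] += 1
    return pairs


def skip_bigram_similarity(sent_a, sent_b, max_skip=None):
    """Dice-style overlap of the two sentences' skip-bigram multisets.

    The scoring formula is an assumption, not taken from the paper.
    """
    a = skip_bigrams(sent_a.lower().split(), max_skip)
    b = skip_bigrams(sent_b.lower().split(), max_skip)
    total = sum(a.values()) + sum(b.values())
    if total == 0:
        return 0.0
    overlap = sum((a & b).values())  # multiset intersection
    return 2.0 * overlap / total


if __name__ == "__main__":
    print(skip_bigram_similarity("the cat sat", "the cat sat"))
    print(skip_bigram_similarity("the cat sat", "sat cat the"))
```

Because each pair is ordered, two sentences that contain the same words in reversed order share no skip bigrams, which is the word-order signal that a bag-of-words semantic vector alone would miss.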
Similar Resources
PolyUCOMP-CORE_TYPED: Computing Semantic Textual Similarity using Overlapped Senses
The Semantic Textual Similarity (STS) task aims to examine the degree of semantic equivalence between sentences (Agirre et al., 2012). This paper presents the work of the Hong Kong Polytechnic University (PolyUCOMP) team, which participated in the STS core and typed tasks of SemEval-2013. For the STS core task, the PolyUCOMP system disambiguates word senses using contexts and then determines sen...
Representing Sentences as Low-Rank Subspaces
Sentences are important semantic units of natural language. A generic, distributional representation of sentences that can capture the latent semantics is beneficial to multiple downstream applications. We observe a simple geometry of sentences: the word representations of a given sentence (on average 10.23 words in all SemEval datasets, with a standard deviation of 4.84) roughly lie in a low-rank...
A Bigram Extension to Word Vector Representation
GloVe is an algorithm which associates a vector to each word such that the dot product of two words corresponds to the likelihood they appear together in a large corpus ([PSM14]). GloVe vectors achieve state-of-the-art performance on word analogy tasks (v(king) − v(man) + v(woman) ≈ v(queen)), but they are limited to capturing meanings of individual words. In our project, we develop “biGloVe,” ...
Ebiquity: Paraphrase and Semantic Similarity in Twitter using Skipgrams
We describe the system we developed to participate in SemEval 2015 Task 1, Paraphrase and Semantic Similarity in Twitter. We create similarity vectors from two-skip trigrams of preprocessed tweets and measure their semantic similarity using our UMBC-STS system. We submitted two runs. The best result ranked eleventh out of eighteen teams, with an F1 score of 0.599.
MayoClinicNLP-CORE: Semantic representations for textual similarity
The Semantic Textual Similarity (STS) task examines semantic similarity at the sentence level. We explored three representations of semantics (implicit or explicit): named entities, semantic vectors, and structured vectorial semantics. From a DKPro baseline, we also performed feature selection and used source-specific linear regression models to combine our features. Our systems placed 5th, 6th, an...